Multi-processor computer system

ABSTRACT

The present invention relates to a multi-processor computer system comprising at least two processors for parallel execution of processes, at least two cache memory units, each being associated with and connected to a separate processor, a connection bus connecting said processors and said cache memory units, and a process list unit connected to said connection line for storing a process list of processes to be available for execution by said processors. In order to enable power saving if no processes for execution are available while guaranteeing a fast wake-up procedure if such processes are available it is proposed according to the present invention that said processors are adapted for loading a global wake-up variable signalling process additions of processes to said process list into their associated cache memory unit, for switching into a low-power mode if said process list contains no process for execution by said processors and for switching into a normal-power mode if said wake-up variable signals an addition of a process to said process list. Thus, according to the present invention the cache coherence protocol is used for communicating and signalling the availability of processes for execution.

The present invention relates to a multi-processor computer system comprising

at least two processors for parallel execution of processes,

at least two cache memory units, each being associated with and connected to a separate processor,

a connection bus connecting said processors and said cache memory units, and

a process list unit connected to said connection line for storing a process list of processes to be available for execution by said processors.

Further, the present invention relates to a corresponding processor, a method of scheduling the execution of processes and a method of executing the process by a processor in such a multi-processor computer system. Still further, the present invention relates to a computer program for implementing said methods.

Multi-processor computer systems execute multiple processes in parallel. Each processor repeatedly selects a process that is ready for execution and executes it until the process blocks or, in the case of pre-emptive scheduling, the time slice of the running process expires. When there is no process ready for selection by a processor or, particularly, its associated scheduler, the processor or its scheduler, respectively, waits in a spin loop until a process which is ready for execution becomes available in the process list. A ready process becomes available by an unblocking operation, e.g. a V semaphore operation, executed by a process running on another processor.

In order to save power, it is preferred to let the processor switch to a low-power or sleep mode rather than letting it spin until a ready process becomes available. However, it is important that other processors can wake up sleeping processors without a large overhead. The standard way to wake up a processor out of the sleeping mode is to send an interrupt to it. The overhead of this method can be large for many parallel applications that have a fine-grain synchronization.

It is therefore an object of the present invention to provide a multi-processor computer system, a corresponding processor, a method of scheduling the execution of processes and a method of executing a process by a processor therein which provide a fast and efficient way of executing processes wherein processors can be switched between a low-power mode and a normal-power mode within a very short time and without a large overhead.

This object is achieved according to the present invention by a multi-processor computer system as claimed in claim 1, wherein said processors are adapted for loading a global wake-up variable signalling process additions of processes to said process list into their associated cache memory unit, for switching into a low-power mode if said process list contains no process for execution by said processors and for switching into a normal-power mode if said wake-up variable signals an addition of a process to said process list.

The present invention is based on the idea of using the cache coherence protocol to wake up sleeping processors. Cache coherence protocols are designed to communicate much faster than interrupts and therefore allow sleeping processors to be woken up in a very efficient and fast way. A global wake-up variable is introduced according to the invention which is held by the cache memory units of the processors. Said wake-up variable signals if a process has been added to the process list. If a processor adds a process to the process list this will be immediately signalled via the cache coherence protocol to the cache memory units of the processors, causing the processors to switch from low-power mode into the normal-power mode.

Preferred embodiments of the invention are defined in the dependent claims. A processor for use in such a multi-processor computer system is defined in claim 6. A method of scheduling the execution of processes is defined in claim 7. A method of executing a process by a processor is defined in claim 8. A computer program for implementing said methods is defined in claim 9. It should be noted that these devices and methods as well as the computer program can be developed further in a similar or identical way as defined in the dependent claims of claim 1.

According to a first preferred embodiment as defined in claim 2, switching into the normal-power mode of the processors is caused by a change of the wake-up variable due to an addition of a process to the process list. A processor adding a process to the process list thus simply has to change the wake-up variable, e.g. by executing a store command as claimed according to the preferred embodiment of claim 3 and writing any new value into said variable. This will immediately be signalled to all cache memory units holding said wake-up variable, causing a switching of the associated processors from low-power mode into normal-power mode.

According to another aspect of the invention the processors are adapted to send a request to other processors to drop the wake-up variable from their associated cache memory unit when adding a process to said process list. Also in this way other processors will immediately be informed of an addition of a new process to the process list and thus switch into the normal-power mode in which they will try to get the process from the process list for execution.

Preferably, an invalidation-based cache coherence protocol is implemented in the multi-processor computer system according to the invention. This means that on a read command from a memory unit other cache memory units are checked to see whether they contain a more up to date version of the data than is in the memory unit. If this is the case the processor holding the more up to date version of the data provides it to the memory. On a write command to data in a cache memory unit, other processors are checked to see whether they cache the same data item. If this is the case, they should invalidate the data item, i.e. remove it from their cache memory unit. Regarding more details of cache coherence protocols, and, in particular, invalidation-based cache coherence protocols, reference is made to John L. Hennessy and David A. Patterson, “Computer architecture, a quantitative approach”, Morgan Kaufmann Publishers, second edition, in particular chapter 8.3.
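The read and write behaviour just described can be illustrated with a minimal sketch. It is a simplified software model and not part of the described system: the names `Cache`, `Bus`, `read` and `write` are hypothetical, cache lines are reduced to single addresses, and a per-line dirty flag stands in for the protocol state.

```python
class Cache:
    """Hypothetical cache model: address -> (value, dirty flag)."""
    def __init__(self):
        self.lines = {}


class Bus:
    """Hypothetical bus connecting shared memory and the caches."""
    def __init__(self, n_caches):
        self.memory = {}
        self.caches = [Cache() for _ in range(n_caches)]

    def read(self, cpu, addr):
        own = self.caches[cpu].lines
        if addr in own:
            return own[addr][0]
        # Read miss: check whether another cache holds a more
        # up-to-date (dirty) copy than the memory unit.
        for i, other in enumerate(self.caches):
            if i != cpu and addr in other.lines and other.lines[addr][1]:
                value, _ = other.lines[addr]
                self.memory[addr] = value           # holder provides the data
                other.lines[addr] = (value, False)  # copy is now clean
        value = self.memory.get(addr, 0)
        own[addr] = (value, False)
        return value

    def write(self, cpu, addr, value):
        # Write: every other cache holding the item invalidates it,
        # i.e. removes it from its cache memory unit.
        for i, other in enumerate(self.caches):
            if i != cpu:
                other.lines.pop(addr, None)
        self.caches[cpu].lines[addr] = (value, True)
```

In this model a write by one processor removes the item from every other cache, which is the event the invention uses as a wake-up signal.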

The invention will now be explained in more detail with reference to the drawings in which

FIG. 1 shows a block diagram of a known multi-processor computer system,

FIG. 2 shows a flow chart of a known method of scheduling the execution of processes,

FIG. 3 shows a flow chart of a method of scheduling the execution of processes according to the invention,

FIG. 4 shows a flow chart of the method of adding a process to a process list according to the invention, and

FIG. 5 shows a block diagram of a multi-processor computer system according to the invention.

FIG. 1 shows a block diagram of a known multi-processor computer system. Said computer system comprises a number of, in the present embodiment four, processors 1, so-called central processing units (CPU), to each of which a cache memory unit 2 is associated and connected. Further, a shared memory unit 3, for instance a random access memory unit, comprising a list of processes to be executed by said processors 1 is provided. The processors 1 are interconnected via the cache memory units 2 through an interconnection line 4, such as a bus, to which the memory unit 3, which may also be regarded as comprising a process list unit, is also connected.

A known method of scheduling the execution of processes in such a multi-processor computer system is illustrated as a flow chart in FIG. 2. The selection of a ready process, i.e. a process that is ready for execution by a processor, consists of waiting until a process appears in a list of ready processes called “process list” (step S10). Multiple processors can be waiting for this, so the process list has to be protected by a lock, since otherwise it is possible that, before the processor takes the ready process from the list, it is taken by another processor (S11). The ready process is then taken from the list in step S12, whereafter the process list is unlocked again for access by other processors which are trying to get processes for execution (S13). In case the processor was successful in getting a process for execution from the process list (S14) it will execute this process, while in the negative case it returns to the beginning where it is set into the state of trying to get a process from the process list. The processors that are currently not executing a process are therefore continuously checking if the process list is empty (S10) as a kind of stand-by state or spin loop.
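The known selection loop of FIG. 2 might be sketched as follows. This is an illustrative model only: the function name `select_process_spinning` is hypothetical, and the `max_spins` parameter is an assumption added so that the demonstration terminates; the method described above spins without bound.

```python
import threading
from collections import deque


def select_process_spinning(process_list, lock, max_spins=None):
    """Spin until a ready process can be taken from process_list.

    Returns (process, number_of_spins); process is None only if the
    hypothetical max_spins bound is reached first.
    """
    spins = 0
    while max_spins is None or spins < max_spins:
        spins += 1
        if not process_list:          # S10: wait until a process appears
            continue                  # busy-wait in normal-power mode
        with lock:                    # S11: lock, S13: unlock on exit
            # S12: another processor may have taken the process already
            process = process_list.popleft() if process_list else None
        if process is not None:       # S14: successful?
            return process, spins
    return None, spins
```

Every iteration of the loop runs at full power even when the list stays empty, which is the cost the invention sets out to avoid.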

FIG. 3 shows an embodiment of a method of scheduling the execution of processes according to the present invention in the form of a flow chart. According to the invention a global variable “wake-up” that is used to signal additions to the process list is introduced. It shall be assumed that, for the beginning, a processor is in a normal-power mode and looking for a ready process. In a first step S20 the processor loads the cache line containing the wake-up variable into its cache memory, if it is not already there, by use of a normal load instruction. Next, the processor checks whether the process list is empty (S21). If the processor has found a ready process in the process list in step S21, it first locks the process list in step S22 to prevent access to said process list by other processors. Next, the processor gets the process from the process list (S23), whereafter the process list is unlocked again (S24).

If step S23 was successful the processor will execute the taken process (S25). The context of that process is restored and the process continues execution.

If step S23 was not successful, a so-called sleep-while-cached (swc) instruction will be executed (S27) with the wake-up variable as parameter. This means that the processor switches from its normal-power mode into a low-power mode, i.e. some kind of sleeping mode, in which it remains as long as the wake-up variable is in its associated cache memory unit or, to be more precise, as long as the cache line of its cache memory unit holds the wake-up variable. The same swc instruction is executed in case step S21 gives a positive result, i.e. if the process list is found empty (S26).
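One pass through steps S20 to S27 of FIG. 3 can be modelled in software as sketched below. The swc instruction is a proposed hardware instruction, so the `swc` method here only imitates its effect; the class `Processor`, its attributes and the function `schedule_once` are hypothetical names chosen for illustration.

```python
import threading
from collections import deque


class Processor:
    """Hypothetical model of one processor and its cache state."""
    def __init__(self):
        self.mode = "normal-power"
        self.wakeup_cached = False   # is the wake-up line in the cache?

    def swc(self):
        # Sleep-while-cached: remain in low-power mode for as long as
        # the cache line holding the wake-up variable is cached.
        if self.wakeup_cached:
            self.mode = "low-power"


def schedule_once(cpu, process_list, lock):
    """Return a process to execute, or None if the processor sleeps."""
    cpu.wakeup_cached = True         # S20: load the wake-up variable
    if not process_list:             # S21: process list empty?
        cpu.swc()                    # S26
        return None
    with lock:                       # S22: lock, S24: unlock on exit
        process = process_list.popleft() if process_list else None  # S23
    if process is None:
        cpu.swc()                    # S27: lost the race for the process
    return process                   # S25: the caller executes it
```

For example, with one ready process and two processors, the first call returns the process and the second leaves its processor in low-power mode with the wake-up line cached.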

If, as shown in FIG. 4, another processor appends a process to the process list for execution, it first locks the process list (S30) before it actually appends the process (S31). After unlocking the process list again (S32), a store command will be performed on the wake-up variable, i.e. a new value will be assigned to the wake-up variable (S33). This will immediately signal to all processors being in a low-power mode that a new process has been added to the process list and will cause an invalidation of the cache line in the cache memory units of such processors, which then switch back from low-power mode to normal-power mode and start again with step S20 (see FIG. 3).
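The adding side of FIG. 4 might look as follows in a simplified model. All names (`SleepingProcessor`, `WakeupVariable`, `add_process`) are hypothetical, and the invalidation performed by the coherence hardware is imitated by an explicit loop over the processors that cache the variable. Incrementing the value is an arbitrary choice, since any store suffices.

```python
import threading
from collections import deque


class SleepingProcessor:
    """Hypothetical processor sleeping after an swc instruction."""
    def __init__(self):
        self.mode = "low-power"
        self.wakeup_cached = True

    def invalidate_wakeup_line(self):
        # Losing the cache line ends the sleep-while-cached state.
        self.wakeup_cached = False
        self.mode = "normal-power"


class WakeupVariable:
    """Hypothetical model of the global wake-up variable."""
    def __init__(self, sharers):
        self.value = 0
        self.sharers = list(sharers)   # processors caching the variable

    def store(self, value):
        # A store invalidates the line in all other caches.
        for cpu in self.sharers:
            cpu.invalidate_wakeup_line()
        self.sharers.clear()
        self.value = value


def add_process(process, process_list, lock, wakeup):
    with lock:                         # S30: lock, S32: unlock on exit
        process_list.append(process)   # S31: append the process
    wakeup.store(wakeup.value + 1)     # S33: store any new value
```

Note that the store follows the unlock, matching the order S32 then S33, so woken processors do not immediately contend with the adding processor for the lock.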

By this method much power can be saved since processors not executing a process are not waiting in a spin loop in normal-power mode but are switched into a low-power mode. However, since according to the present invention a cache coherence protocol is used for signalling additions of processes to the process list using said wake-up variable held in the cache memory units of sleeping processors, the wake-up procedure is very fast, in particular faster than using interrupts.

A block diagram of a multi-processor computer system in which the invention is implemented is shown in FIG. 5. Between the processor 1 and the cache memory unit 2, additional communication lines 7, 8 besides the normal data path 6 are added according to the present invention. Communication line 7 is used to communicate a wake-up address from the processor 1 to the cache memory unit 2, i.e. to pass the address specified by the swc instruction to the cache memory unit 2. Communication line 8 is used to communicate a wake-up signal from the cache memory unit 2 to the processor 1 to cause it to switch from low-power mode to normal-power mode when the specified address disappears from the cache memory unit.
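The role of communication lines 7 and 8 can be imitated in software as a pair of calls between two objects. This is only an analogy for the hardware signals: the classes `CacheUnit` and `Processor` and their methods are hypothetical, and a cache line is simplified to a single address.

```python
class CacheUnit:
    """Hypothetical model of cache memory unit 2."""
    def __init__(self, processor):
        self.processor = processor
        self.lines = {}              # address -> value
        self.wakeup_address = None   # address received over line 7

    def load(self, address, value):
        self.lines[address] = value

    def drop(self, address):
        # Called when the coherence protocol invalidates an address.
        self.lines.pop(address, None)
        if address == self.wakeup_address:
            self.wakeup_address = None
            self.processor.wake()    # line 8: wake-up signal


class Processor:
    """Hypothetical model of processor 1."""
    def __init__(self):
        self.mode = "normal-power"
        self.cache = CacheUnit(self)

    def swc(self, address):
        # Sleep only while the address is actually cached.
        if address in self.cache.lines:
            self.cache.wakeup_address = address   # line 7: wake-up address
            self.mode = "low-power"

    def wake(self):
        self.mode = "normal-power"
```

In this model, dropping an unrelated address leaves the processor asleep; only the disappearance of the address passed by the swc instruction raises the wake-up signal.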

If, in step S33 of FIG. 4, a processor stores an arbitrary value to the wake-up variable, which is a normal store instruction, the following happens. If another processor is looking for a ready process then that processor has the wake-up variable in its cache memory unit, meaning that the cache contains the cache line that corresponds to the memory block in which the wake-up variable is stored. If another processor is caching the wake-up variable then the processor intending to store an arbitrary value to the wake-up variable cannot modify the wake-up variable by means of such a store instruction directly, since the cache coherence protocol prevents this. In order to do the store operation the processor sends out a broadcast to all other processors with a request to drop the wake-up variable from their cache memory unit. In terms of the cache coherence protocol, in particular of the MSI, MESI or MOESI type, the processor makes a transition from shared or invalid to modified state. This causes processors that were sleeping after an swc instruction to wake up and switch into the normal-power mode. These processors will then check the process list, and one of them will be successful in getting the just added process. The others will switch back into the low-power mode according to the swc instruction.
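For an MSI-type protocol, the state change on the wake-up line can be sketched as a pure function. The function name `store_to_wakeup_line` and the dictionary representation are hypothetical, and the model covers only the three MSI states for this one cache line.

```python
def store_to_wakeup_line(states, writer):
    """Model the MSI transition when `writer` stores to the wake-up line.

    states maps a processor name to 'M', 'S' or 'I' for that line.
    Returns the new states and the processors that had to drop the
    line, i.e. those that wake up if they were sleeping after swc.
    """
    # Processors holding a valid copy receive the request to drop it.
    dropped = [cpu for cpu, state in states.items()
               if cpu != writer and state != "I"]
    # The writer moves to modified; every other copy becomes invalid.
    new_states = {cpu: ("M" if cpu == writer else "I") for cpu in states}
    return new_states, dropped
```

Starting from `{"cpu0": "I", "cpu1": "S", "cpu2": "S", "cpu3": "I"}`, a store by `cpu0` leaves only `cpu0` with a valid (modified) copy and wakes `cpu1` and `cpu2`, of which one obtains the added process while the other goes back to sleep.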

It should be noted that a processor loads the wake-up variable before trying to get a process from the process list. Doing this in the reverse order might lead to the situation that the processor switches to the low-power mode while there is a ready process in the process list.
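The effect of the ordering can be shown with a deterministic sketch of the worst-case interleaving, in which another processor adds a process between the two steps. The function name and variables are hypothetical, and the model assumes that swc does not put the processor to sleep when the wake-up line is no longer cached.

```python
def sleeps_despite_ready_process(load_wakeup_first):
    """Return True if the processor ends up asleep with a process ready."""
    ready = []                  # shared process list, initially empty
    wakeup_cached = False       # is the wake-up line in our cache?

    def other_processor_adds():
        nonlocal wakeup_cached
        ready.append("p1")      # S31: append a process
        wakeup_cached = False   # S33: the store invalidates any cached copy

    if load_wakeup_first:
        wakeup_cached = True        # S20: load the wake-up variable
        found_empty = not ready     # S21: check the process list
        other_processor_adds()      # the addition lands in between
    else:
        found_empty = not ready     # reversed order: check first
        other_processor_adds()      # the store finds nothing to invalidate
        wakeup_cached = True        # the load comes too late

    # swc keeps the processor asleep only while the line is cached.
    asleep = found_empty and wakeup_cached
    return asleep and len(ready) > 0
```

With the described order the invalidation caused by the store is not lost, so the processor does not stay asleep; with the reversed order the store happens before the line is cached and the wake-up is missed.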

Besides saving power in the process scheduler, the swc instruction according to the present invention as explained above could be useful for other purposes where fast synchronisation between processors is required. While many processors have instructions to switch to low-power sleep mode, there is no processor and no multi-processor computer system known that is able to wake up and switch into the normal-power mode because of cache coherence transactions as proposed according to the present invention, which provides a very fast and effective solution.

1. Multi-processor computer system comprising at least two processors (1) for parallel execution of processes, at least two cache memory units (2), each being associated with and connected to a separate processor (1), a connection bus (4) connecting said processors (1) and said cache memory units (2), and a process list unit (3) connected to said connection line (4) for storing a process list of processes to be available for execution by said processors (1), wherein said processors (1) are adapted for loading a global wake-up variable signalling process additions of processes to said process list into their associated cache memory unit (2), for switching into a low-power mode if said process list contains no process for execution by said processors (1) and for switching into a normal-power mode if said wake-up variable signals an addition of a process to said process list.
2. Multi-processor computer system as claimed in claim 1, wherein said processors (1) are adapted to switch into the normal-power mode if the wake-up variable held in the associated cache memory units (2) is changed due to an addition of a process to said process list.
3. Multi-processor computer system as claimed in claim 1, wherein said processors (1) are adapted to execute a store command on the wake-up variable when adding a process to said process list.
4. Multi-processor computer system as claimed in claim 1, wherein said processors (1) are adapted to send a request to other processors to drop the wake-up variable from their associated cache memory unit when adding a process to said process list.
5. Multi-processor computer system as claimed in claim 1, wherein said computer system is adapted for implementing an invalidation based cache coherence protocol.
6. Processor for use in a multi-processor computer system comprising at least two processors (1) for parallel execution of processes, at least two cache memory units (2), each being associated with and connected to a separate processor (1), a connection bus (4) connecting said processors (1) and said cache memory units (2), and a process list unit (3) connected to said connection line (4) for storing a process list of processes to be available for execution by said processors (1), wherein said processor (1) is adapted for loading a global wake-up variable signalling process additions of processes to said process list into its associated cache memory unit (2), for switching into a low-power mode if said process list contains no process for execution by said processor and for switching into a normal-power mode if said wake-up variable signals an addition of a process to said process list.
7. Method of scheduling the execution of processes in a multi-processor computer system comprising at least two processors (1) for parallel execution of processes, at least two cache memory units (2), each being associated with and connected to a separate processor (1), a connection bus (4) connecting said processors (1) and said cache memory units (2), and a process list unit (3) connected to said connection line (4) for storing a process list of processes to be available for execution by said processors (1), said method comprising the steps of: loading a global wake-up variable signalling process additions of processes to said process list by a processor (1) into its associated cache memory unit (2), adding a process to said process list, and changing the wake-up variable signalling said addition of a process to said process list thus causing said processor (1) to switch from a low-power mode into a normal-power mode.
8. Method of executing a process by a processor in a multi-processor computer system comprising at least two processors (1) for parallel execution of processes, at least two cache memory units (2), each being associated with and connected to a separate processor (1), a connection bus (4) connecting said processors (1) and said cache memory units (2), and a process list unit (3) connected to said connection line (4) for storing a process list of processes to be available for execution by said processors (1), said method comprising the steps of: loading a global wake-up variable signalling process additions of processes to said process list into an associated cache memory unit (2), switching into a low-power mode if said process list contains no process for execution by said processor (1), switching into a normal-power mode if said wake-up variable signals an addition of a process to said process list, and accessing said process list to get said added process for execution.
10. Computer program comprising computer program code means for causing a computer to perform the steps of the method as claimed in claim 7 if said methods are executed by said computer.